
In the Claims,

Please cancel all pending claims and add the following new claims:




A file content classification system comprising: 

a plurality of agents, each agent including a file content ID generator, at least
one agent provided on one of a plurality of clients;

an ID appearance database, provided on a server, coupled to receive file 
content IDs from the agents; and 

a characteristic comparison routine on the server, identifying a characteristic 
of the file content based on the appearance of the file content ID in the appearance
database, and transmitting the characteristic to the client agents. 

The content classification system of claim [illegible] wherein said ID generator
comprises a hashing algorithm. 








The content classification system of claim 22 wherein said hashing algorithm
is the MD5 hashing algorithm.
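
By way of illustration only, and not as a limitation of the claim, an ID generator of this kind might be sketched as follows. The function name `file_content_id` and the choice to hash the entire content are assumptions made for the example; the claims do not fix either.

```python
import hashlib


def file_content_id(content: bytes) -> str:
    """Return the hex MD5 digest of the content (hypothetical ID format)."""
    # Assumption: the whole content is hashed; identical content
    # therefore always yields the same ID on every client.
    return hashlib.md5(content).hexdigest()
```

Two agents that see byte-identical content would report the same ID, which is what lets a server count appearances without receiving the content itself.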



The content classification system of claim [illegible] wherein said ID appearance
database tracks the frequency of appearance of a digital ID.





The content classification system of claim [illegible] wherein said plurality of agents
are coupled to said database via a combination of public and private networks.





The content classification system of claim [illegible] wherein said database is
coupled to an intermediate server which is coupled to said plurality of agents. 






The content classification system of claim [illegible] wherein said intermediate
server is a web server. 

The content classification system of claim wherein said characteristic
comprises junk e-mail and said characteristic is defined by a frequency of appearance
of a file content ID.

A method for identifying characteristics of data files, comprising: 

receiving, on a processing system, file content identifiers for data files from a
plurality of file content identifier generator agents, each agent provided on a source
system, via a network;

determining, on the processing system, whether each received content
identifier matches a characteristic of other identifiers; and

outputting, to at least one of the source systems responsive to a request from
said source system, an indication of the characteristic of the data file based on said
step of determining.





The method of claim [illegible] wherein said file content identifier generates an
identifier by hashing at least a portion of the data file. 



The method of claim 40 wherein said hashing comprises using the MD5 hash. 




The method of claim 40 wherein said step of generating comprises hashing
multiple portions of the data file. 






The method of claim [illegible] wherein each said data file is an email message and
said step of determining comprises determining whether said email is SPAM. 


The method of claim [illegible] wherein said step of determining identifies said
e-mail as SPAM by tracking the rate per unit time a digital ID is generated.

The method of claim [illegible] wherein said method further includes the step of
instructing said plurality of source systems to perform an action with the email based 
on said determining step. 

A method of filtering an email message, comprising:

receiving, on a second computer, a digital content identifier unique to the
message content from at least two of a plurality of first computers having digital
content ID generator agents;

comparing, on the second computer, the digital content identifier to a
characteristic database of digital content identifiers received from said plurality of
first computers to determine whether the message has a characteristic; and

responding to a query from at least one of said plurality of computers to
identify the existence or absence of said characteristic of the message based on said
comparing.







The method of claim wherein said second computer is coupled to said 
plurality of first computers by a combination of public and private networks. 

The method of claim 47 wherein said step of receiving includes receiving 
identifiers from said plurality of first systems via an intervening Web server.

The method of claim [illegible] wherein said plurality of systems are coupled by the
[illegible].

The method of claim 46 wherein said step of comparing comprises 
determining the frequency of a particular ID occurring in a time period, classifying 
said ID as having a characteristic, and comparing digital content identifiers to said 
classified IDs. 








A file content classification system, comprising: 
a first system having a file to be classified;
a file ID generator on the first system outputting at least one file ID value for
the file based on a generated hash of at least one selected portion of said file; 






Attorney Docket No.: PACE-01 000US0 
pace/ 1 000/response-002.doc 






a database on a second system coupled to the ID generator to receive IDs 
generated by the ID generator; and 

a comparison routine on the second system classifying the ID relative to the 
database as meeting or not meeting a characteristic.

The system of claim [illegible] including a plurality of first systems each including a
respective file ID generator, coupled to the database on the second system.




The system of claim [illegible] wherein the plurality of first systems is coupled to the
second system via a combination of public and private networks.







The system of claim [illegible] wherein the second system comprises a Web server
interface system and a database system, wherein the database system is isolated from
the Internet by a Web server system.





A file content classification system for a first computer and a second 
computer coupled by a network, comprising: 

a client agent file content identifier generator on the first computer, the file 
content identifier comprising a computed value of at least two non-contiguous
sections of data in a file; and 

a server comparison agent and data-structure on the second computer 
receiving identifiers from the client agent and providing replies to the client agent; 

wherein the client agent processes the file based on replies from the server 
comparison agent. 
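
By way of illustration only, a computed value over two non-contiguous sections of a file might be sketched as follows. The function name `sectioned_content_id`, the section offsets and lengths, and the use of MD5 for this particular claim are all assumptions chosen for the example; the claim itself specifies none of them.

```python
import hashlib


def sectioned_content_id(data: bytes, sections=((0, 64), (128, 64))) -> str:
    """Hash selected (offset, length) sections of data into one hex ID.

    The default sections cover bytes 0..63 and 128..191, leaving a gap
    between them, so the two sections are non-contiguous (assumed layout).
    """
    digest = hashlib.md5()
    for offset, length in sections:
        # Only the selected sections contribute to the identifier.
        digest.update(data[offset:offset + length])
    return digest.hexdigest()
```

Under these assumed offsets, a change confined to bytes 64 through 127 leaves the identifier unchanged, while a change inside either selected section produces a different identifier.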



A method for providing a service on the Internet, comprising:



collecting data on a processing system from a plurality of systems having a
client agent generating digital content identifiers for each of a plurality of files on the
Internet to a server having a database;

characterizing the files on the server system based on said digital content
identifiers received relative to other digital content identifiers collected in the
database; and

transmitting a content identifier from the server to the client agent indicating 
the presence or absence of a characteristic in the file. 

The method of claim [illegible] wherein said step of collecting comprises collecting a
digital identifier for a data file.






The method of claim [illegible] wherein said file content is an e-mail.


The method of claim [illegible] wherein said step of characterizing comprises:
tracking the frequency of the collection of a particular identifier; 
characterizing the data file based on said frequency; 
storing the characterization; and 

comparing collected identifiers to the known characterization. 
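
By way of illustration only, the tracking, characterizing, storing, and comparing steps recited above might be sketched as follows. The class name `AppearanceDatabase`, its method names, and the fixed count threshold are assumptions made for the example; the claims do not state how frequency is measured or what threshold applies.

```python
from collections import Counter


class AppearanceDatabase:
    """Hypothetical in-memory stand-in for the server-side database."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold   # assumed cutoff, not from the claims
        self.counts = Counter()      # tracks frequency per identifier
        self.flagged = set()         # stored characterizations

    def collect(self, content_id: str) -> None:
        """Record one appearance and characterize the ID by its frequency."""
        self.counts[content_id] += 1
        if self.counts[content_id] >= self.threshold:
            self.flagged.add(content_id)

    def has_characteristic(self, content_id: str) -> bool:
        """Compare a collected identifier to the stored characterizations."""
        return content_id in self.flagged
```

In this sketch an identifier reported three or more times is treated as having the characteristic, and a later query for that identifier returns `True`.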